Abstract

The identification of the lag length for vector autoregressive models by means
of the Akaike Information Criterion (AIC), Partial Autoregressive and Correlation
Matrices (PAM and PCM hereafter) is studied in the framework of processes
with time varying variance. It is highlighted that the use of the standard tools
is not justified in such a case. As a consequence we propose an adaptive AIC
which is robust to the presence of unconditional heteroscedasticity. Corrected
confidence bounds are proposed for the usual PAM and PCM obtained from
the Ordinary Least Squares (OLS) estimation. The volatility structure of
the innovations is used to develop adaptive PAM and PCM. We underline
that the adaptive PAM and PCM are more accurate than the OLS PAM and
PCM for identifying the lag length of the autoregressive models. Monte Carlo
experiments show that the adaptive AIC has a greater ability to select the
correct autoregressive order than the standard AIC. An illustrative application
using US international finance data is presented.

1. Introduction

The analysis of time series using linear models is usually carried out in three
steps. First the model is identified, then it is estimated, and finally the
goodness-of-fit of the model is checked (see Brockwell and Davis (1991, chapters 8
and 9)). Tools for the three phases of the specification-estimation-verification modeling
cycle of time series with constant unconditional innovations variance are available in
most specialized software, for instance R, SAS or JMulTi. The identification
stage is important for the choice of a suitable model for the data. In this step the 
partial autoregressive and correlation matrices (PAM and PCM hereafter) are often 
used to identify VAR models with stationary innovations (see Tiao and Box (1981)). 
Information criteria are also extensively used. In the framework of stationary processes 
numerous information criteria have been studied (see e.g. Hannan and Quinn (1979), 
Cavanaugh (1997) or Boubacar Mainassara (2012)). One of the most commonly used
information criteria is the Akaike Information Criterion (AIC) proposed by Akaike
(1973). Nevertheless it is widely documented in the literature that the constant
volatility assumption is unrealistic for many economic data. Reference can be made
to McConnell, Mosser and Perez-Quiros (1999), Kim and Nelson (1999), Stock and
Watson (2002), Ahmed, Levin and Wilson (2002), Herrera and Pesavento (2005) or 
Davis and Kahn (2008). In this paper we investigate the lag length identification 
problem of autoregressive processes in the important case where the unconditional 
innovations volatility is time varying. 

The statistical inference of processes with non constant variance has recently attracted
much attention. Sanso, Arago and Carrion (2004) or Galeano and Pena (2004)
among other contributions proposed tests to detect volatility breaks in the residuals.
Francq and Gautier (2004) studied the estimation of ARMA models with time varying
parameters, allowing a finite number of regimes for the variance. Mikosch and Starica
(2004) give some theoretical evidence that financial data may exhibit non constant
variance. In the context of GARCH models reference can be made to the works of 
Kokoszka and Leipus (2000), Engle and Rangel (2005), Dahlhaus and Rao (2006) 
or Horvath, Kokoszka and Zhang (2006) who investigated the inference for processes 
with unconditional time varying variance. In the multivariate framework Bai (2000), 
Qu and Perron (2007) or Kim and Park (2010) among others studied models with 
unconditionally non constant variance. Xu and Phillips (2008) studied the estimation 
of univariate autoregressive models whose innovations have a non constant volatility. 
Patilea and Raïssi (2010) generalized their findings to the case of Vector AutoRegressive
(VAR) models with time-varying volatility. In these works the asymptotic normality
of the Ordinary Least Squares (OLS) estimator of the autoregressive parameters is
established. An important output of these papers is that the asymptotic covariance
matrix obtained when the non constant volatility is taken into account can be quite
different from the standard covariance matrix expression. As a consequence they
also provided Adaptive Least Squares (ALS) estimators which achieve a more efficient
estimation of the autoregressive parameters. Patilea and Raïssi (2011) proposed
tools for checking the adequacy of the autoregressive order of VAR models when the 
unconditional variance is non constant. 

In this paper modified tools for lag length identification in the case of autoregressive 
processes with time-varying volatility are introduced. The unreliability of the use of 
the standard AIC for the identification step in VAR modeling in presence of non 
constant variance is first highlighted. Consequently a modified AIC based on the 
adaptive estimation of the non constant volatility structure is proposed. We establish 
the suitability of the adaptive AIC to identify the autoregressive order of non stationary 
but stable VAR processes through theoretical results and numerical illustrations. On 
the other hand it is also shown that the standard results on the OLS estimators of the 
PAM and PCM can be quite misleading. Consequently corrected confidence bounds 
are proposed. Using the adaptive approach more efficient estimators of the PAM 
and PCM are proposed. Therefore the identification tools proposed in this paper 
may be viewed as a complement of the above mentioned results on the estimation 
and diagnostic testing in the important framework of autoregressive models with non 
constant volatility. 

The structure of the paper is as follows. In Section 2 we define the model and
introduce assumptions which give the general framework of our study. The asymptotic
behavior of different estimators of the autoregressive parameters is given. We also
describe the adaptive estimation of the volatility. In Section 3 it is shown that the
standard AIC is irrelevant for model selection when the innovations volatility is not
constant. The adaptive AIC is derived taking into account the time-varying variance in
the Kullback-Leibler discrepancy. In Section 5 some Monte Carlo experiment results
are given to examine the performance of the studied information criteria for VAR
model identification in our non standard framework. We also investigate the lag length
selection of a bivariate system of US international finance variables.
2. Estimation of the model

In this paper we restrict our attention to VAR models since they are extensively
used for the analysis of multivariate time series (see e.g. Lütkepohl (2005)). Let us
consider the $d$-dimensional autoregressive process $(X_t)$ satisfying

$$X_t = A_{01}X_{t-1} + \dots + A_{0p_0}X_{t-p_0} + u_t \qquad (2.1)$$
$$u_t = H_t\epsilon_t,$$

where the $A_{0i}$'s, $i \in \{1,\dots,p_0\}$, are such that $\det(A(z)) \neq 0$ for all $|z| \leq 1$, with
$A(z) = I_d - \sum_{i=1}^{p_0} A_{0i}z^i$ and $\det(.)$ denotes the determinant of a square matrix. We
suppose that $X_{-p+1},\dots,X_0,\dots,X_n$ are observed with $p > p_0$. Now let us denote by
$[.]$ the integer part. For ease of exposition we shall assume that the process $(\epsilon_t)$ is iid
standard Gaussian. Throughout the paper we assume that the following conditions on
the volatility structure of the innovations process $(u_t)$ hold.

Assumption A1: The $d \times d$ matrices $H_t$ are positive definite and satisfy $H_{[nr]} =
G(r)$, where the components of the matrix $G(r) := \{g_{kl}(r)\}$ are measurable deterministic
functions on the interval $(0,1]$, such that $\sup_{r\in(0,1]}|g_{kl}(r)| < \infty$, and each
$g_{kl}$ satisfies a Lipschitz condition piecewise on a finite number of some sub-intervals
that partition $(0,1]$. The matrix $\Sigma(r) = G(r)G(r)'$ is assumed positive definite for all $r$.

The rescaling method of Dahlhaus (1997) is considered to specify the volatility
structure in Assumption A1. Note that one should formally use the notation $X_{t,n}$
with $0 < t \leq n$ and $n \in \mathbb{N}$. Nevertheless we do not use the subscript $n$ to lighten the
notations. This specification allows us to consider kinds of time-varying variance which are
commonly considered in the literature, for instance abrupt shifts, smooth transitions
or periodic heteroscedasticity. Note that Sensier and Van Dijk (2004) found that
approximately 80% among 214 US macro-economic data they investigated exhibit a
volatility break. Starica (2003) hypothesized that the returns of the Standard and
Poors 500 stock market index have a non constant unconditional volatility. Hence
considering the framework given by A1 is important given the strong empirical evidence



Identification of stable VAR models 






of non-constant unconditional volatility in many macro-economic and financial data.
Our assumption is similar to that of recent papers in the literature. For instance a similar
structure for the volatility was considered by Xu and Phillips (2008) or Kim and Park
(2010) among others. Our framework encompasses the important case of piecewise constant
volatility as considered in Pesaran and Timmerman (2004) or Bai (2000). Boswijk
and Zu (2007) allowed for stochastic effects in the non constant volatility structure but
excluded the case of abrupt shifts. Finally it is important to underline that the framework
induced by A1 is different from the case of autoregressive processes with conditionally
heteroscedastic but (strictly) stationary errors. For instance the well known ARMA-GARCH
models cannot take into account non constant unconditional volatility in
the innovations. The model identification problem for stationary processes which may
display nonlinearities has been recently investigated by Boubacar Mainassara (2012)
in a quite general framework.
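
To make the framework of Assumption A1 concrete, the following minimal sketch simulates a VAR process whose innovations are rescaled by a deterministic function of $r = t/n$. It is only an illustration under stated assumptions: the function names `simulate_tv_var` and `abrupt_shift`, the zero initial values and the particular break (a standard deviation that triples at $r = 1/2$) are hypothetical choices, not taken from the paper.

```python
import numpy as np

def simulate_tv_var(A_list, G, n, seed=0):
    """Simulate X_t = A_1 X_{t-1} + ... + A_p X_{t-p} + u_t with u_t = G(t/n) eps_t.

    A_list : list of (d, d) autoregressive matrices (assumed to define a stable VAR)
    G      : function r -> (d, d) matrix on (0, 1], the rescaled volatility
    """
    rng = np.random.default_rng(seed)
    p, d = len(A_list), A_list[0].shape[0]
    X = np.zeros((n + p, d))                      # p zero initial values, then X_1..X_n
    for t in range(1, n + 1):
        u_t = G(t / n) @ rng.standard_normal(d)   # H_t = G(t/n), eps_t iid N(0, I_d)
        row = p + t - 1
        X[row] = sum(A @ X[row - 1 - i] for i, A in enumerate(A_list)) + u_t
    return X[p:]

def abrupt_shift(r, d=2):
    """Hypothetical example: the innovation standard deviation triples at r = 1/2."""
    return (1.0 if r < 0.5 else 3.0) * np.eye(d)

X = simulate_tv_var([0.5 * np.eye(2)], abrupt_shift, n=400)
```

Smooth transitions or periodic patterns can be obtained by passing another function `G`; the autoregressive part is unchanged, which is the sense in which the process is stable but not stationary.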

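In this part we introduce estimators of the autoregressive parameters. Let us rewrite
(2.1) as follows

$$X_t = (\tilde X_{t-1}' \otimes I_d)\theta_0 + u_t \qquad (2.2)$$
$$u_t = H_t\epsilon_t,$$

where $\theta_0 = (\mathrm{vec}(A_{01})',\dots,\mathrm{vec}(A_{0p_0})')' \in \mathbb{R}^{p_0d^2}$ is the vector of the true autoregressive
parameters and $\tilde X_{t-1} = (X_{t-1}',\dots,X_{t-p_0}')'$. For a fitted autoregressive order $p > p_0$,
the OLS estimator is given by

$$\hat\theta_{OLS} = \hat\Sigma_{\tilde X}^{-1}\mathrm{vec}(\hat\Sigma_X),$$

where

$$\hat\Sigma_{\tilde X} = n^{-1}\sum_{t=1}^n \tilde X_{t-1}\tilde X_{t-1}' \otimes I_d \quad\text{and}\quad \hat\Sigma_X = n^{-1}\sum_{t=1}^n X_t\tilde X_{t-1}',$$

and $\tilde X_{t-1} = (X_{t-1}',\dots,X_{t-p}')'$. If we suppose that the true unconditional variances
$\Sigma_t := H_tH_t'$ are known, we can define the following Generalized Least Squares (GLS)
estimator

$$\hat\theta_{GLS} = \hat\Sigma_{\underline{\tilde X}}^{-1}\mathrm{vec}(\hat\Sigma_{\underline X}), \qquad (2.3)$$

with

$$\hat\Sigma_{\underline{\tilde X}} = n^{-1}\sum_{t=1}^n \tilde X_{t-1}\tilde X_{t-1}' \otimes \Sigma_t^{-1} \quad\text{and}\quad \hat\Sigma_{\underline X} = n^{-1}\sum_{t=1}^n \Sigma_t^{-1}X_t\tilde X_{t-1}'.$$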
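
The OLS and GLS estimators above differ only in the weighting matrices, which the following sketch tries to make explicit. It is a minimal illustration, not the author's implementation: the names `lagged_regressors` and `gls_estimator` are hypothetical, the first $p$ observations are used as initial values, and the GLS version assumes the true $\Sigma_t$'s are supplied by the user (which is infeasible in practice).

```python
import numpy as np

def lagged_regressors(X, p):
    """Return Y (m, d) and Z (m, d*p) whose rows are (X_{t-1}', ..., X_{t-p}')."""
    n = X.shape[0]
    Z = np.hstack([X[p - j:n - j] for j in range(1, p + 1)])
    return X[p:], Z

def gls_estimator(X, p, Sigmas=None):
    """Solve sum_t (Z_t Z_t' kron S_t^{-1}) theta = vec(sum_t S_t^{-1} X_t Z_t').

    Sigmas : array (m, d, d) of variances aligned with the rows of Y;
             None means S_t = I_d for all t, which gives the OLS estimator.
    """
    Y, Z = lagged_regressors(X, p)
    m, d = Y.shape
    if Sigmas is None:
        Sigmas = np.broadcast_to(np.eye(d), (m, d, d))
    lhs = np.zeros((d * d * p, d * d * p))
    rhs = np.zeros((d, d * p))
    for y, z, S in zip(Y, Z, Sigmas):
        S_inv = np.linalg.inv(S)
        lhs += np.kron(np.outer(z, z), S_inv)
        rhs += S_inv @ np.outer(y, z)
    # vec(.) stacks columns, hence the Fortran-order flattening
    return np.linalg.solve(lhs, rhs.flatten(order="F"))
```

The returned vector stacks $\mathrm{vec}(A_1),\dots,\mathrm{vec}(A_p)$; reshaping it with `theta.reshape(d, d * p, order="F")` recovers the matrix $[A_1 \dots A_p]$. When all the $\Sigma_t$'s are equal the weighting cancels and the GLS and OLS estimates coincide.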

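Let us define $u_t(\theta) = X_t - (\tilde X_{t-1}' \otimes I_d)\theta$ with $\theta \in \mathbb{R}^{pd^2}$. Note that
$\hat\theta_{GLS}$ maximizes the log-likelihood function (up to a constant and divided by $n$)

$$\mathcal{L}_{GLS}(\theta) = -\frac{1}{2n}\sum_{t=1}^n \left[\ln\{\det(\Sigma_t)\} + u_t(\theta)'\Sigma_t^{-1}u_t(\theta)\right], \qquad (2.4)$$

(see Lütkepohl (2005, p 589)). If we assume that the innovations process volatility is
constant ($\Sigma_t = \Sigma_u$ for all $t$) and unknown, the standard log-likelihood function

$$\mathcal{L}_{OLS}(\theta,\Sigma) = -\frac{1}{2}\ln(\det(\Sigma)) - \frac{1}{2n}\sum_{t=1}^n u_t(\theta)'\Sigma^{-1}u_t(\theta), \qquad (2.5)$$

where $\Sigma$ is a $d \times d$ invertible matrix, is usually used for the estimation of the parameters.
The estimator obtained by maximizing $\mathcal{L}_{OLS}$ with respect to $\theta$ corresponds to $\hat\theta_{OLS}$.
In this case the estimator of the constant variance $\Sigma_u$ is given by $\hat\Sigma_u := n^{-1}\sum_{t=1}^n \hat u_t\hat u_t'$
where $\hat u_t := u_t(\hat\theta_{OLS})$ are the residuals of the OLS estimation of (2.1).
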


In practice the assumption of known variance is not realistic. Therefore we consider
an adaptive estimator of the autoregressive parameters. We may first define adaptive
estimators of the true unconditional variances $\Sigma_t := H_tH_t'$ as in Patilea and Raïssi
(2010)

$$\check\Sigma_t = \sum_{i=1}^n w_{ti}(b)\,\hat u_i\hat u_i',$$

where the weights $w_{ti}$ are given by

$$w_{ti}(b) = \Big(\sum_{i=1}^n K_{ti}(b)\Big)^{-1}K_{ti}(b),$$

with $b$ the bandwidth and

$$K_{ti}(b) = \begin{cases} K\left(\frac{t-i}{nb}\right) & \text{if } t \neq i, \\ 0 & \text{if } t = i, \end{cases}$$

where $K(.)$ is the kernel function which is such that $\int_{-\infty}^{+\infty}K(z)dz = 1$. The bandwidth
$b$ is taken in a range $B_n = [c_{min}b_n, c_{max}b_n]$ with $c_{max} > c_{min} > 0$ some constants and
$b_n \downarrow 0$ at a suitable rate. Alternatively one can use different bandwidth cells for the
$\check\Sigma_t$'s (see Patilea and Raïssi (2010) for more details). The results in this paper are given
uniformly with respect to $b \in B_n$. This justifies the approach which consists in selecting
the bandwidth on a grid defined in a range using for example the cross validation
criterion. Note also that the $\check\Sigma_t$'s are positive definite. Of course our results do not rely
on a particular bandwidth choice procedure and are valid provided estimators of the
$\Sigma_t$'s with asymptotic properties similar to those of the $\check\Sigma_t$'s are available. The non parametric
estimator of the variance employed in this paper is similar to the volatility estimators
used in Xu and Phillips (2008), Beare (2008), in the univariate case or Boswijk and Zu
(2007) in the multivariate case among other references. Considering the $\check\Sigma_t$'s, we are
in position to introduce the ALS estimators

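$$\hat\theta_{ALS} = \check\Sigma_{\underline{\tilde X}}^{-1}\mathrm{vec}(\check\Sigma_{\underline X}),$$

with

$$\check\Sigma_{\underline{\tilde X}} = n^{-1}\sum_{t=1}^n \tilde X_{t-1}\tilde X_{t-1}' \otimes \check\Sigma_t^{-1} \quad\text{and}\quad \check\Sigma_{\underline X} = n^{-1}\sum_{t=1}^n \check\Sigma_t^{-1}X_t\tilde X_{t-1}'.$$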
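
The kernel step of the adaptive approach can be sketched as follows. This is a minimal illustration under assumptions that are not in the paper: a Gaussian kernel, a single bandwidth `b` chosen by hand rather than by cross-validation, and a hypothetical function name `kernel_variances`. The observation at date $t$ is left out of its own average, in line with a weight equal to zero when $t = i$.

```python
import numpy as np

def kernel_variances(resid, b):
    """Leave-one-out kernel estimates of Sigma_t from residuals of shape (m, d).

    Returns an array (m, d, d) whose t-th slice is sum_i w_ti(b) u_i u_i'.
    """
    m = resid.shape[0]
    idx = np.arange(m)
    # Gaussian kernel evaluated at (t - i) / (m b); its normalizing constant
    # cancels when the weights are rescaled to sum to one.
    K = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / (m * b)) ** 2)
    np.fill_diagonal(K, 0.0)                        # weight zero when t = i
    W = K / K.sum(axis=1, keepdims=True)
    return np.einsum("ti,ij,ik->tjk", W, resid, resid)
```

In an adaptive fit, `resid` would hold the OLS residuals and the inverses of the returned matrices would replace the unknown $\Sigma_t^{-1}$ in the weighted least squares formulas.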



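Now we have to state the asymptotic behavior of the estimators and introduce some
notations. Define

$$\mathbf{A} = \begin{pmatrix} A_1 & \dots & A_{p-1} & A_p \\ I_d & & 0 & 0 \\ & \ddots & & \vdots \\ 0 & & I_d & 0 \end{pmatrix}$$

and $e_p(1)$ the vector of dimension $p$ such that the first component is equal to one and
zero elsewhere. Under A1 it is shown in Patilea and Raïssi (2010) that

$$\sqrt{n}(\hat\theta_{OLS} - \theta_0) \Rightarrow \mathcal{N}(0, \Lambda_3^{-1}\Lambda_2\Lambda_3^{-1}), \qquad (2.6)$$

with

$$\Lambda_2 = \int_0^1 \sum_{i=0}^{\infty}\left\{\mathbf{A}^i(e_p(1)e_p(1)' \otimes \Sigma(r))\mathbf{A}^{i\prime}\right\} \otimes \Sigma(r)\,dr,$$

$$\Lambda_3 = \int_0^1 \sum_{i=0}^{\infty}\left\{\mathbf{A}^i(e_p(1)e_p(1)' \otimes \Sigma(r))\mathbf{A}^{i\prime}\right\} \otimes I_d\,dr,$$

and

$$\sqrt{n}(\hat\theta_{GLS} - \theta_0) \Rightarrow \mathcal{N}(0, \Lambda_1^{-1}), \qquad (2.7)$$

where

$$\Lambda_1 = \int_0^1 \sum_{i=0}^{\infty}\left\{\mathbf{A}^i(e_p(1)e_p(1)' \otimes \Sigma(r))\mathbf{A}^{i\prime}\right\} \otimes \Sigma(r)^{-1}\,dr.$$

In addition we may use the following consistent estimators for the covariance matrices:

$$\hat\Sigma_{\underline{\tilde X}} = \Lambda_1 + o_p(1), \quad \hat\Sigma_{\tilde X} = \Lambda_3 + o_p(1) \quad\text{and}\quad \hat\Lambda_2 := n^{-1}\sum_{t=1}^n \tilde X_{t-1}\tilde X_{t-1}' \otimes \hat u_t\hat u_t' = \Lambda_2 + o_p(1). \qquad (2.8)$$
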

We make the following assumptions to state the asymptotic equivalence between the 
ALS and GLS estimators. 

Assumption A1': Suppose that all the conditions in Assumption A1 hold true.
In addition:

(i) $\inf_{r\in(0,1]}\lambda_{min}(\Sigma(r)) > 0$ where for any symmetric matrix $M$ the real value
$\lambda_{min}(M)$ denotes its smallest eigenvalue.

(ii) $\sup_t \|\epsilon_{kt}\|_8 < \infty$ for all $k \in \{1,\dots,d\}$.

Assumption A2: (i) The kernel $K(\cdot)$ is a bounded density function defined on
the real line such that $K(\cdot)$ is nondecreasing on $(-\infty,0]$ and decreasing on $[0,\infty)$ and
$\int_{\mathbb{R}} v^2K(v)dv < \infty$. The function $K(\cdot)$ is differentiable except at a finite number of points
and the derivative $K'(\cdot)$ is an integrable function. Moreover, the Fourier Transform
$\mathcal{F}[K](\cdot)$ of $K(\cdot)$ satisfies $\int_{\mathbb{R}}|s\mathcal{F}[K](s)|ds < \infty$.

(ii) The bandwidths $b_{kl}$, $1 \leq k \leq l \leq d$, are taken in the range $B_T = [c_{min}b_T, c_{max}b_T]$
with $0 < c_{min} < c_{max} < \infty$ and $b_T + 1/(Tb_T^{2+\gamma}) \to 0$ as $T \to \infty$, for some $\gamma > 0$.

(iii) The sequence $\nu_T$ is such that $T\nu_T^2 \to 0$.



Under these additional assumptions Patilea and Raïssi (2010) also showed that

$$\sqrt{n}(\hat\theta_{ALS} - \hat\theta_{GLS}) = o_p(1), \qquad (2.9)$$

and

$$\check\Sigma_{[nr]} = \Sigma(r-)\int_{-\infty}^0 K(z)dz + \Sigma(r+)\int_0^{\infty} K(z)dz + o_p(1). \qquad (2.10)$$

As a consequence $\hat\theta_{ALS}$ and $\hat\theta_{GLS}$ have the same asymptotic behavior and we can also
write $\check\Sigma_t = \Sigma_t + o_p(1)$, except at the break dates where we have $\check\Sigma_t = \Sigma_t + O_p(1)$.
Using these asymptotic results we underline the unreliability of the standard AIC and
develop a criterion which is adapted to the case of non stationary but stable processes.
Corrected confidence bounds for the PAM and PCM in our non standard framework
are also proposed.



3. Derivation of the adaptive AIC 

In the standard case (the volatility of the innovations is constant with true variance
$\Sigma_u$) the Kullback-Leibler discrepancy between the true model and the approximating
model with parameter vector $\hat\theta_{OLS}$ is given by

$$\Delta_n(\hat\theta_{OLS},\theta_0) = E_{\theta_0,\Sigma_u}\{-2\mathcal{L}_{OLS}(\theta,\Sigma_u)\}\big|_{\theta=\hat\theta_{OLS}}, \qquad (3.1)$$

see Brockwell and Davis (1991, p 302). Akaike (1973) proposed the following approximately
unbiased estimator of (3.1) to compare the discrepancies between competing
VAR($p$) models

$$AIC(p) = -2\mathcal{L}_{OLS}(\hat\theta_{OLS},\hat\Sigma_u) + \frac{2pd^2}{n},$$

where the term $2pd^2$ penalizes the more complicated models fitted to the data (see
Lütkepohl (2005), p 147). The terms corresponding to the nuisance parameters are
neglected in the previous expressions since they do not interfere in the model selection
when the AIC is used. The identified model corresponds to the model which minimizes
the AIC. However in our non standard framework it is clear that $\mathcal{L}_{OLS}$ cannot
take into account the non constant variance in the observations. In addition if we
assume that the volatility of the innovations is constant, $\Sigma_t = \Sigma_u$, we obtain

$$\sqrt{n}(\hat\theta_{OLS} - \theta_0) \Rightarrow \mathcal{N}(0, \Lambda_4^{-1}), \qquad (3.2)$$

with $\Lambda_4 = E(\tilde X_{t-1}\tilde X_{t-1}') \otimes \Sigma_u^{-1}$, so that the following result is used for the derivation
of the standard AIC

$$E_{\theta_0,\Sigma_u}\left\{n(\hat\theta_{OLS} - \theta_0)'\hat\Lambda_4(\hat\theta_{OLS} - \theta_0)\right\} \approx pd^2,$$

for large $n$, where $\hat\Lambda_4 = n^{-1}\sum_{t=1}^n \tilde X_{t-1}\tilde X_{t-1}' \otimes \hat\Sigma_u^{-1}$ is a consistent estimator of $\Lambda_4$. In
view of (2.6) this property is obviously not verified in our case. Indeed Patilea and
Raïssi (2010) pointed out that $\Lambda_4^{-1}$ and $\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1}$ can be quite different. Therefore the
standard AIC has no theoretical basis in our non standard framework and we can
expect that the use of the standard AIC can be misleading in such a situation.



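To remedy this problem we shall use the more appropriate expression (2.4) in
our framework for the Kullback-Leibler discrepancy between the fitted model and the
true model

$$\Delta_n(\hat\theta_{GLS},\theta_0) = E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta)\}\big|_{\theta=\hat\theta_{GLS}}. \qquad (3.3)$$

Using a second order Taylor expansion of $\mathcal{L}_{GLS}$ about $\hat\theta_{GLS}$ and since
$\partial\mathcal{L}_{GLS}(\hat\theta_{GLS})/\partial\theta = 0$, we obtain

$$E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta_0)\} = \frac{1}{n}E_{\theta_0}\left[n(\hat\theta_{GLS} - \theta_0)'\hat\Sigma_{\underline{\tilde X}}(\hat\theta_{GLS} - \theta_0)\right] + E_{\theta_0}\{-2\mathcal{L}_{GLS}(\hat\theta_{GLS})\} + o(1). \qquad (3.4)$$

Using again the second order Taylor expansion and taking the expectation we also
write

$$E_{\theta_0}\left[E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta)\}\big|_{\theta=\hat\theta_{GLS}}\right] = \frac{1}{n}E_{\theta_0}\left[n(\hat\theta_{GLS} - \theta_0)'E_{\theta_0}\{\hat\Sigma_{\underline{\tilde X}}\}(\hat\theta_{GLS} - \theta_0)\right] + E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta_0)\} + o(1). \qquad (3.5)$$

From (2.7) we have for large $n$

$$E_{\theta_0}\left[n(\hat\theta_{GLS} - \theta_0)'\hat\Sigma_{\underline{\tilde X}}(\hat\theta_{GLS} - \theta_0)\right] \approx pd^2$$

and

$$E_{\theta_0}\left[n(\hat\theta_{GLS} - \theta_0)'E_{\theta_0}\{\hat\Sigma_{\underline{\tilde X}}\}(\hat\theta_{GLS} - \theta_0)\right] \approx pd^2.$$

Noting that

$$E_{\theta_0}\{\Delta_n(\hat\theta_{GLS},\theta_0)\} = E_{\theta_0}\{-2\mathcal{L}_{GLS}(\hat\theta_{GLS})\}$$
$$\quad + E_{\theta_0}\left[E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta)\}\big|_{\theta=\hat\theta_{GLS}}\right] - E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta_0)\}$$
$$\quad + E_{\theta_0}\{-2\mathcal{L}_{GLS}(\theta_0)\} - E_{\theta_0}\{-2\mathcal{L}_{GLS}(\hat\theta_{GLS})\}$$

and using (3.4) and (3.5), we see that the following criterion based on the GLS estimator

$$AIC_{GLS} = -2\mathcal{L}_{GLS}(\hat\theta_{GLS}) + \frac{2pd^2}{n}$$

is an approximately unbiased estimator of $\Delta_n(\hat\theta_{GLS},\theta_0)$.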

Nevertheless the $AIC_{GLS}$ is infeasible since it depends on the unknown volatility of
the errors. Therefore we use the adaptive estimation of the variance structure to
propose a feasible selection criterion. Recall that

$$-2\mathcal{L}_{GLS}(\hat\theta_{GLS}) = \frac{1}{n}\sum_{t=1}^n \left[\ln\{\det(\Sigma_t)\} + u_t(\hat\theta_{GLS})'\Sigma_t^{-1}u_t(\hat\theta_{GLS})\right],$$

and define

$$-2\mathcal{L}_{ALS}(\hat\theta_{ALS}) = \frac{1}{n}\sum_{t=1}^n \left[\ln\{\det(\check\Sigma_t)\} + u_t(\hat\theta_{ALS})'\check\Sigma_t^{-1}u_t(\hat\theta_{ALS})\right].$$

In view of (2.9) and (2.10) we have

$$-2\mathcal{L}_{GLS}(\hat\theta_{GLS}) = -2\mathcal{L}_{ALS}(\hat\theta_{ALS}) + o_p(1),$$

since we allowed for a finite number of volatility breaks for the innovations. Therefore
we can introduce the adaptive criterion

$$AIC_{ALS} = -2\mathcal{L}_{ALS}(\hat\theta_{ALS}) + \frac{2pd^2}{n},$$

which gives an approximately unbiased estimation of (3.3) for large $n$.
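
To illustrate how the two criteria could be computed side by side for a candidate order $p$, here is a minimal sketch. It rests on several assumptions that are not in the paper: a Gaussian kernel with a fixed bandwidth `b`, a common estimation sample starting after `p_max` initial values so that orders are comparable, and hypothetical helper names (`design`, `kernel_variances`, `weighted_residuals`, `criteria`).

```python
import numpy as np

def design(X, p, p_max):
    """Responses X_t and regressors (X_{t-1}', ..., X_{t-p}') on a common sample."""
    n = X.shape[0]
    Z = np.hstack([X[p_max - j:n - j] for j in range(1, p + 1)])
    return X[p_max:], Z

def kernel_variances(U, b):
    """Leave-one-out Gaussian-kernel estimates of Sigma_t, shape (m, d, d)."""
    m = U.shape[0]
    idx = np.arange(m)
    K = np.exp(-0.5 * ((idx[:, None] - idx[None, :]) / (m * b)) ** 2)
    np.fill_diagonal(K, 0.0)
    W = K / K.sum(axis=1, keepdims=True)
    return np.einsum("ti,ij,ik->tjk", W, U, U)

def weighted_residuals(Y, Z, S_inv):
    """Residuals of the least squares fit weighted by the matrices S_inv[t]."""
    d, q = Y.shape[1], Z.shape[1]
    lhs = np.einsum("ti,tj,tkl->ikjl", Z, Z, S_inv).reshape(q * d, q * d)
    rhs = np.einsum("tkl,tl,ti->ik", S_inv, Y, Z).reshape(-1)
    A = np.linalg.solve(lhs, rhs).reshape(q, d).T      # A = [A_1 ... A_p]
    return Y - Z @ A.T

def criteria(X, p, p_max, b):
    """Return (AIC, AIC_ALS) for a VAR(p) fitted to the rows of X."""
    Y, Z = design(X, p, p_max)
    m, d = Y.shape
    penalty = 2 * p * d ** 2 / m
    # Standard AIC: constant-variance likelihood evaluated at the OLS fit
    U = weighted_residuals(Y, Z, np.broadcast_to(np.eye(d), (m, d, d)))
    S_u = U.T @ U / m
    quad = np.einsum("ti,ij,tj->t", U, np.linalg.inv(S_u), U)
    aic = np.linalg.slogdet(S_u)[1] + quad.mean() + penalty
    # Adaptive AIC: kernel variances from the OLS residuals, then weighted fit
    S = kernel_variances(U, b)
    S_inv = np.linalg.inv(S)
    V = weighted_residuals(Y, Z, S_inv)
    quad_als = np.einsum("ti,tij,tj->t", V, S_inv, V)
    aic_als = np.mean(np.linalg.slogdet(S)[1] + quad_als) + penalty
    return aic, aic_als
```

A lag length would then be chosen by evaluating `criteria(X, p, p_max, b)` for `p = 1, ..., p_max` and taking the minimizer of each component separately. The two criteria are not on the same scale, so only comparisons across `p` within one criterion are meaningful.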






Finally note that if we suppose that (Xt) is cointegrated, Kim and Park (2010) 
showed that the long run relationships estimated by reduced rank are n-consistent 
(see section 2.3 of Cavaliere et al (2010) for a detailed discussion on the concept of 
cointegration in our framework). Therefore our approach for building information 
criteria can be straightforwardly extended to the cointegrated case since it is clear that 
the estimated long run relationships can be replaced by the true relationships in the 
preceding computations. 

4. Identifying the lag length using partial autoregressive and partial correlation matrices

In this part we assume $p > p_0$, so that the cut-off property of the presented tools
can be observed. Following the approach described in Reinsel (1993) chapter 3, one
can use the estimators of the autoregressive parameters to identify the lag length of
(2.1). Consider the regression of $X_t$ on its past values

$$X_t = A_{01}X_{t-1} + \dots + A_{0p}X_{t-p} + u_t. \qquad (4.1)$$

We can remark that the partial autoregressive matrices $A_{0p_0+1},\dots,A_{0p}$ are equal to
zero. The PAM are estimated using OLS or ALS estimation. Confidence bounds for
the PAM can be proposed as follows. Let us introduce the $d^2(p-p_0) \times d^2p$ dimensional
matrix $R = (0, I_{d^2(p-p_0)})$, so that from (2.6), (2.7) and (2.9) we write

$$\sqrt{n}R\hat\theta_{OLS} \Rightarrow \mathcal{N}(0, R\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1}R') \quad\text{and}\quad \sqrt{n}R\hat\theta_{ALS} \Rightarrow \mathcal{N}(0, R\Lambda_1^{-1}R'), \qquad (4.2)$$

where $R\hat\theta_{OLS}$ and $R\hat\theta_{ALS}$ correspond to the OLS and ALS estimators of the null
matrices $A_{0p_0+1},\dots,A_{0p}$. Denote by $v_i^{OLS}$ (resp. $v_i^{ALS}$) the asymptotic standard
deviation of the $i$th component of $\hat\theta_{OLS}$ (resp. $\hat\theta_{ALS}$) for $i \in \{d^2p_0+1,\dots,d^2p\}$ with
obvious notations. From (4.2) the $i$th component of $\hat\theta_{OLS}$ (resp. $\hat\theta_{ALS}$) is usually
compared with the 95% approximate asymptotic confidence bounds $\pm 1.96\,v_i^{OLS}$ (resp.
$\pm 1.96\,v_i^{ALS}$) as suggested in Tiao and Box (1981). The $v_i^{OLS}$'s and $v_i^{ALS}$'s can be
obtained using the consistent estimators in (2.8). Therefore the identified lag length for
model (2.1) corresponds to the highest order of the matrix $A_{0i}$ which has an estimator
of a component which is clearly beyond its confidence bounds about zero.
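
As an illustration of how corrected bounds for the OLS estimates might be obtained, the sketch below computes both the standard bounds and bounds based on a sandwich covariance that plugs sample analogues into the OLS asymptotic covariance. It is a sketch under stated assumptions: the function name `ols_with_bounds` is hypothetical, the first $p$ observations serve as initial values, and the bounds are expressed directly on the scale of the estimates (the asymptotic standard deviations are divided by the square root of the sample size).

```python
import numpy as np

def ols_with_bounds(X, p, z=1.96):
    """OLS estimate of theta = vec([A_1 ... A_p]) with standard and sandwich bounds."""
    n, d = X.shape
    Y = X[p:]
    Z = np.hstack([X[p - j:n - j] for j in range(1, p + 1)])   # rows (X_{t-1}', ..., X_{t-p}')
    m, q = Z.shape
    B = np.linalg.lstsq(Z, Y, rcond=None)[0]                   # (d*p, d), equals [A_1 ... A_p]'
    U = Y - Z @ B                                              # OLS residuals
    Szz_inv = np.linalg.inv(Z.T @ Z / m)
    # Standard covariance, justified under constant variance
    cov_std = np.kron(Szz_inv, U.T @ U / m)
    # Sandwich covariance: bread = (Z'Z/m kron I_d)^{-1},
    # meat = average of (Z_t Z_t') kron (u_t u_t')
    bread = np.kron(Szz_inv, np.eye(d))
    meat = np.einsum("ti,tk,tj,tl->ikjl", Z, U, Z, U).reshape(q * d, q * d) / m
    cov_rob = bread @ meat @ bread
    theta = B.reshape(-1)                                      # theta[i*d + k] = A[k, i]
    return theta, z * np.sqrt(np.diag(cov_std) / m), z * np.sqrt(np.diag(cov_rob) / m)
```

An estimated coefficient of a high-order lag would be flagged when its absolute value exceeds the corresponding bound. With a marked volatility change the sandwich bounds are typically wider than the standard ones, so fewer spurious lags are flagged.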

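The identification of the lag length of standard VAR processes is usually also performed
using the partial cross-correlation matrices which are the extension of the partial
correlations of the univariate case. Consider the regressions

$$X_{t-p} = \phi_1X_{t-p+1} + \dots + \phi_{p-1}X_{t-1} + w_t,$$

$$X_t = A_{01}X_{t-1} + \dots + A_{0p-1}X_{t-p+1} + u_t,$$

with $p > 1$. In our framework it is clear that the error process $(w_t)$ is unconditionally
heteroscedastic, and then we define $\Sigma_w = \lim_{n\to\infty} n^{-1}\sum_{t=1}^n E(w_tw_t')$ which converges
under A1, and the consistent estimator of $\Sigma_w$:

$$\hat\Sigma_w = n^{-1}\sum_{t=p}^n X_{t-p}X_{t-p}' - \Big\{n^{-1}\sum_{t=p}^n X_{t-p}\tilde X_{t-1}^{p-1\prime}\Big\}\Big\{n^{-1}\sum_{t=p}^n \tilde X_{t-1}^{p-1}\tilde X_{t-1}^{p-1\prime}\Big\}^{-1}\Big\{n^{-1}\sum_{t=p}^n \tilde X_{t-1}^{p-1}X_{t-p}'\Big\},$$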

with obvious notations. The consistency of this estimator can be proved from standard
computations and using lemmas 7.1-7.4 of Patilea and Raïssi (2010). We also define the
'long-run' innovations variance $\Sigma_u = \lim_{n\to\infty} n^{-1}\sum_{t=1}^n E(u_tu_t')$ where the $E(u_tu_t')$'s
are non constant and the consistent estimator $\hat\Sigma_u = n^{-1}\sum_{t=1}^n \hat u_t\hat u_t'$ of $\Sigma_u$ where we
recall that the $\hat u_t$'s are the OLS residuals.

Several definitions for the partial cross-correlations are available in the literature.
In the sequel we concentrate on the definition given in Ansley and Newbold (1979)
which is used in the VARMAX procedure of the software SAS. We propose to extend
the partial cross-correlation matrices to our framework as follows

$$P(p) = (\Sigma_u^{-1/2} \otimes \Sigma_w^{-1/2})\,\mathrm{vec}\Big\{\lim_{n\to\infty} n^{-1}\sum_{t=1}^n E(w_tu_t')\Big\} = (\Sigma_u^{-1/2} \otimes \Sigma_w^{1/2})\,\mathrm{vec}(A_{0p}) \qquad (4.3)$$

and it is clear that for $p > p_0$ we have $P(p) = 0$. The expression (4.3) may be viewed as
the 'long-run' relation between the $X_t$'s and the $X_{t-p}$'s corrected for the intermediate
values for each date $t$. Consider the OLS and ALS consistent estimators

$$\hat P_{OLS}(p) = (\hat\Sigma_u^{-1/2} \otimes \hat\Sigma_w^{1/2})\,\mathrm{vec}(R\hat\theta_{OLS})$$

$$\hat P_{ALS}(p) = (\hat\Sigma_u^{-1/2} \otimes \hat\Sigma_w^{1/2})\,\mathrm{vec}(R\hat\theta_{ALS}),$$

where $R = (0, I_{d^2})$ is of dimension $d^2 \times d^2p$, so that $R\hat\theta_{OLS}$ and $R\hat\theta_{ALS}$ correspond to
the OLS and ALS estimators of $A_{0p}$ in (4.1). Using again (2.6), (2.7), and from
the consistency of $\hat\Sigma_u$ and $\hat\Sigma_w$ we obtain

$$n^{1/2}\hat P_{OLS}(p) \Rightarrow \mathcal{N}\left(0, (\Sigma_u^{-1/2} \otimes \Sigma_w^{1/2})(R\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1}R')(\Sigma_u^{-1/2} \otimes \Sigma_w^{1/2})\right) \qquad (4.4)$$

$$n^{1/2}\hat P_{ALS}(p) \Rightarrow \mathcal{N}\left(0, (\Sigma_u^{-1/2} \otimes \Sigma_w^{1/2})(R\Lambda_1^{-1}R')(\Sigma_u^{-1/2} \otimes \Sigma_w^{1/2})\right). \qquad (4.5)$$

Hence approximate confidence bounds can be built using (4.4) and (4.5). Similarly to
the partial autoregressive matrices the highest order $p$ for which a cut-off is observed for
an element of $\hat P_{OLS}(p)$ (resp. $\hat P_{ALS}(p)$) corresponds to the identified lag length for the
VAR model. Note that for $p = 1$ ($p_0 = 0$, so that the observed process is uncorrelated
and $w_t = X_{t-1}$, $u_t = X_t$), we have $\lim_{n\to\infty} n^{-1}\sum_{t=1}^n E(X_tX_t') = \Sigma_w = \Sigma_u$. In this
case similar results to (4.4) and (4.5) can be used.

Let us end this section with some remarks on the OLS and ALS estimation approaches
of the PAM and PCM. If we assume that the variance of the error process is
constant, the result (3.2) is used to identify the autoregressive order using the partial
autoregressive and correlation matrices obtained from the OLS estimation. However
as pointed out in the previous section this standard result can be misleading in our
framework. From the real example below it appears that the OLS PAM and PCM
with the standard confidence bounds seem to select a too large lag length. Note that
since the tools presented in this section are based on the results (2.6), (2.7) and on
the adaptive estimation of the autoregressive parameters, they are able to take into
account changes in the volatility.

In the univariate case the partial autocorrelation function is used for identifying the
autoregressive order. In such a case the asymptotic behavior of $\hat\theta_{ALS}$ does not depend
on the volatility structure (see equation (4.6) below). Hence the ALS estimators of
the partial autocorrelations do not depend on the volatility function, contrary
to the OLS estimators. In the general VAR case Patilea and Raïssi (2010) also showed
that $\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1} - \Lambda_1^{-1}$ is positive semi-definite (the same result is available in the
univariate case). Therefore the tools based on the ALS estimator are more accurate
than the tools based on the OLS estimator for identifying the autoregressive order.

We illustrate the above remarks by considering the following simple case where we
assume that $\Sigma(r) = \sigma(r)^2I_d$ with $\sigma(\cdot)^2$ a real-valued function. The univariate stable
autoregressive processes are a particular case of this specification of the volatility. In
this case we obtain

$$\Lambda_1 = \Lambda_4 = \sum_{i=0}^{\infty}\left\{\mathbf{A}^i(e_p(1)e_p(1)' \otimes I_d)\mathbf{A}^{i\prime}\right\} \otimes I_d \qquad (4.6)$$

and

$$\Lambda_2 = \int_0^1 \sigma(r)^4dr\,\Lambda_1, \qquad \Lambda_3 = \int_0^1 \sigma(r)^2dr\,\Lambda_1,$$

so that we have

$$\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1} = \frac{\int_0^1 \sigma(r)^4dr}{\left(\int_0^1 \sigma(r)^2dr\right)^2}\,\Lambda_1^{-1} \qquad (4.7)$$

with $\int_0^1 \sigma(r)^4dr \geq \left(\int_0^1 \sigma(r)^2dr\right)^2$ from the Jensen inequality. Hence from (4.7) it is
clear that the adaptive PAM and PCM are more reliable than the PAM and PCM
obtained using the OLS approach. In addition from (4.6) the asymptotic behavior of
the adaptive partial autoregressive and partial correlation matrices does not depend on
the volatility function. On the other hand we also see from (4.7) that the matrices $\Lambda_4^{-1}$
and $\Lambda_3^{-1}\Lambda_2\Lambda_3^{-1}$ can be quite different. Therefore since the standard results do not take
time-varying variance into account, the standard bounds for the standard PAM and
PCM may be misleading for identifying the lag length of autoregressive models.
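
The size of the scalar factor in (4.7) can be checked numerically for a given volatility function. The sketch below approximates the ratio of $\int_0^1\sigma(r)^4dr$ to $(\int_0^1\sigma(r)^2dr)^2$ by a midpoint rule; the function name `variance_inflation` and the example volatility (a variance jumping from 1 to 9 at $r = 1/2$) are hypothetical illustrations, not values taken from the paper.

```python
import numpy as np

def variance_inflation(sigma2, n_grid=10_000):
    """Midpoint-rule approximation of int sigma^4 / (int sigma^2)^2 on (0, 1]."""
    r = (np.arange(n_grid) + 0.5) / n_grid
    s2 = np.array([sigma2(x) for x in r])      # sigma2(r) plays the role of sigma(r)^2
    return np.mean(s2 ** 2) / np.mean(s2) ** 2

# Variance equal to 1 on the first half of the sample and 9 on the second half
ratio_break = variance_inflation(lambda r: 1.0 if r < 0.5 else 9.0)
# Constant variance: the ratio is one and OLS loses nothing
ratio_const = variance_inflation(lambda r: 2.0)
```

For this break the ratio is $41/25 = 1.64$, so in this simple case the OLS asymptotic variance is 1.64 times the ALS one, and bounds derived from the standard formula would be too narrow by a factor of about 1.28 (the square root of 1.64).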



5. Empirical results 

For our empirical study the $AIC_{ALS}$ is computed using an adaptive estimation of
the variance as described in Section 2. The ALS estimators of the PAM and PCM are
obtained similarly. In particular the bandwidth is selected using the cross-validation
method. The OLS partial autoregressive and partial correlation matrices used with
the standard confidence bounds are denoted by $PAM_S$ and $PCM_S$. Similarly we
also introduce the $PAM_{OLS}$, $PAM_{ALS}$ and the $PCM_{OLS}$, $PCM_{ALS}$ with obvious
notations. In the simulation study part the infeasible $AIC_{GLS}$ is used only for
comparison with the feasible $AIC_{ALS}$.

It is important to note that when the PAM and PCM are used, the practitioners 
base their decision on the visual inspection of these tools (see our real data study 
below). For instance it is well known that there are cases where the PAM and 
PCM are beyond the confidence bounds but not taken into account for the lag length 
identification. Therefore simulation results concerning the PAM and PCM do not
really reflect their ability to identify the lag length in practice and are not displayed,
since in a simulation the lag length has to be selected automatically over the iterations. The
use of the modified PAM and PCM is illustrated in the real data study.

For a given tool we assume in our experiments that when the selected autoregressive
order is such that $p > 5$, the model identification is suspected to be unreliable (for
instance the more complicated models do not seem to be penalized enough by the information
criterion). In such situations the practitioner is likely to stop the procedure.

5.1. Monte Carlo experiments 

In this part $N = 1000$ independent trajectories of bivariate VAR(2) ($p_0 = 2$)
processes of length $n = 50$, $n = 100$ and $n = 200$ are simulated with autoregressive
parameters given by 



Recall that the process $(\epsilon_t)$ is assumed iid standard Gaussian. In order to study
the behavior of the information criteria when the innovation process $(u_t)$ is in fact
homoscedastic, we consider the case $\Sigma_t = I_2$ for all $t$. The results are given in Tables
1-3. Two kinds of volatilities are used for our experiments in the heteroscedastic case.
When the variance changes smoothly in time we consider the following specification

$$\Sigma(r) = \begin{pmatrix} (1+\gamma_1r) & \rho(1+\gamma_1r)^{1/2}(1+\gamma_2r)^{1/2} \\ \rho(1+\gamma_1r)^{1/2}(1+\gamma_2r)^{1/2} & (1+\gamma_2r) \end{pmatrix}. \qquad (5.1)$$

In case of abrupt change the following volatility specification is used

$$\Sigma(r) = \begin{pmatrix} (1+f_1(r)) & \rho(1+f_1(r))^{1/2}(1+f_2(r))^{1/2} \\ \rho(1+f_1(r))^{1/2}(1+f_2(r))^{1/2} & (1+f_2(r)) \end{pmatrix}, \qquad (5.2)$$

with $f_i(r) = (\gamma_i - 1)\mathbf{1}_{(r \geq 1/2)}(r)$. In this case we have a common volatility break
at the date $t = n/2$. In all the experiments we take $\gamma_1 = 20$, $\gamma_2 = \gamma_1/3$ and $\rho =
0.2$. The results are given in Tables 4-6 for the volatility specification (5.1) and in
Tables 7-9 when specification (5.2) is used. To facilitate the comparison of the studied
identification tools, the most frequently selected lag length is in bold type.

The small sample properties of the different information criteria for selecting the
autoregressive order are analyzed. According to Tables 1-9 we can remark that the AIC
selects too large lag lengths under both constant and non constant innovations volatility.
This is in accordance with the fact that the AIC is not consistent (see e.g. Paulsen (1984)
or Hurvich and Tsai (1989)). On the other hand we can remark that the $AIC_{ALS}$
and $AIC_{GLS}$ select most frequently the true autoregressive order, and select
$p > p_0$ only in few cases. We also note that the frequency of selection of the true lag length
$p = p_0 = 2$ increases with $n$ for the $AIC_{GLS}$ and $AIC_{ALS}$. It appears that the results
for the $AIC_{GLS}$ are similar in the homoscedastic and heteroscedastic frameworks. This
can be explained by the fact that the volatility specifications are assumed known in
the GLS approach, and hence do not interfere much in the lag length selection. The
infeasible $AIC_{GLS}$ provides slightly better results than the $AIC_{ALS}$. As expected it
can be seen that the difference between the $AIC_{GLS}$ and $AIC_{ALS}$ seems more marked
when the processes display an abrupt volatility change. This may be explained by the
fact that in view of (2.10) the volatility is not consistently estimated at the break dates.
Nevertheless such bias is divided by $n$, and we note that the behaviors of the $AIC_{GLS}$
and $AIC_{ALS}$ become similar as the sample size increases in all the studied cases. According
to our simulation results it appears that the adaptive AIC is better able to select the
appropriate autoregressive order than the standard AIC when the underlying process
is indeed a VAR process.
5.2. Real data study

In this part we try to identify the VAR order of a bivariate system of variables
taken from US international finance data. The first differences of the quarterly US
Government Securities (GS hereafter) held by foreigners and the Foreign Direct Investment
(FDI hereafter) in the US in billions of dollars are studied from January 1, 1973
to October 1, 2009. The length of the series is $n = 147$. The studied series are plotted
in Figure 5.1 and can be downloaded from the website of the research division of the Federal
Reserve Bank of Saint Louis: www.research.stlouisfed.org.

We first highlight some features of the studied series. The OLS residuals and the
variances of the errors estimated by kernel smoothing are plotted in Figure 5.2. From
Figure 5.1 it appears that the data do not have a random walk behavior. From Figure
5.2 the estimated volatilities do not seem constant. The residuals plotted in Figure 5.2
show that the variance of the first component of the residuals seems to be constant from
January 1973 to October 1995, after which we may suspect an abrupt volatility change.
Similarly the variance of the second component of the residuals seems constant from
January 1973 to July 1998, after which we remark an abrupt volatility change. Therefore
the standard homoscedasticity assumption does not appear realistic for the studied
series.
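
The kernel smoothing of the error variances mentioned above can be sketched as follows. This is a simplified, componentwise Nadaraya-Watson smoother of squared residuals with a Gaussian kernel; the kernel, the bandwidth value and the function name are assumptions made for illustration, and the multivariate estimator actually used in the paper is not reproduced here. The simulated residuals with a variance break at mid-sample are hypothetical.

```python
import numpy as np

def kernel_variance(resid, bandwidth):
    """Nadaraya-Watson estimate of Var(u_t) from squared residuals.

    resid: 1-D array of residuals u_hat_1, ..., u_hat_n.
    bandwidth: smoothing parameter on the rescaled time axis t/n
               (hypothetical choice, not taken from the paper).
    """
    resid = np.asarray(resid, dtype=float)
    n = resid.size
    grid = np.arange(1, n + 1) / n
    # Gaussian kernel weights w[t, s], normalised so that each row sums to one.
    w = np.exp(-0.5 * ((grid[:, None] - grid[None, :]) / bandwidth) ** 2)
    w /= w.sum(axis=1, keepdims=True)
    return w @ resid**2

# Hypothetical residuals: standard deviation 1 before mid-sample, 3 afterwards.
rng = np.random.default_rng(1)
n = 400
sd = np.where(np.arange(n) < n // 2, 1.0, 3.0)
u = sd * rng.standard_normal(n)

sigma2_hat = kernel_variance(u, bandwidth=0.05)
print(sigma2_hat[:100].mean(), sigma2_hat[-100:].mean())
```

Away from the break the smoothed curve tracks the two variance levels, while close to the break date it averages over both regimes, which is consistent with the remark that the volatility is not well estimated at break dates.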

We fitted VAR(p) models with p ∈ {1, ..., 5} to the data and computed the AIC
and AIC_ALS for each p. In our VAR system the first component corresponds to the GS
and the second corresponds to the FDI. From Table 10 the AIC decreases as p
increases, so that the highest autoregressive order p = 5 is selected, while the minimum
value of the AIC_ALS is attained for p = 2. If it is assumed that the studied processes
follow a VAR model, and since we noted that the variance of the studied processes
seems non constant, it is likely that the AIC is not reliable and selects too large an
autoregressive order. In view of our above results the model identification with the
more parsimonious AIC_ALS seems to be more reliable. We also considered the PCM
obtained from the standard, OLS and ALS estimation methods. The PCM are plotted
in Figures 5.3 and 5.4, and it appears that we can identify p = 2 using the modified
tools while p = 3 could be identified using the standard PCM. We also see that the
standard and OLS confidence bounds can be quite different. The PAM are given below
with the 95% confidence bounds in brackets. We base our lag length choice on the
PAM which are clearly greater than their 95% confidence bounds (in bold type). The
PAM_S give for the studied data:



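\[
\hat{A}_1^{S} = \begin{pmatrix} -0.35_{[\pm 0.16]} & 0.12_{[\pm 0.19]} \\ 0.06_{[\pm 0.14]} & -0.72_{[\pm \ldots]} \end{pmatrix}, \qquad
\hat{A}_2^{S} = \begin{pmatrix} -0.55_{[\pm 0.17]} & 0.08_{[\pm 0.23]} \\ 0.08_{[\pm 0.14]} & -0.27_{[\pm 0.20]} \end{pmatrix},
\]

\[
\hat{A}_3^{S} = \begin{pmatrix} -0.32_{[\pm 0.18]} & 0.07_{[\pm 0.23]} \\ 0.15_{[\pm 0.16]} & -0.20_{[\pm 0.20]} \end{pmatrix}, \qquad
\hat{A}_4^{S} = \begin{pmatrix} -0.25_{[\pm 0.17]} & 0.03_{[\pm 0.24]} \\ -0.04_{[\pm 0.15]} & -0.03_{[\pm 0.21]} \end{pmatrix},
\]

\[
\hat{A}_5^{S} = \begin{pmatrix} -0.06_{[\pm 0.17]} & -0.04_{[\pm 0.19]} \\ -0.02_{[\pm 0.15]} & -0.05_{[\pm 0.17]} \end{pmatrix}.
\]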
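We obtain the following PAM_OLS

\[
\hat{A}_1^{OLS} = \begin{pmatrix} -0.35_{[\pm 0.23]} & 0.12_{[\pm 0.23]} \\ 0.06_{[\pm 0.19]} & -0.72_{[\pm 0.19]} \end{pmatrix}, \qquad
\hat{A}_2^{OLS} = \begin{pmatrix} -0.55_{[\pm 0.27]} & 0.08_{[\pm 0.35]} \\ 0.08_{[\pm 0.24]} & -0.27_{[\pm 0.36]} \end{pmatrix},
\]

\[
\hat{A}_3^{OLS} = \begin{pmatrix} -0.32_{[\pm 0.31]} & 0.07_{[\pm 0.35]} \\ 0.15_{[\pm 0.18]} & -0.20_{[\pm 0.31]} \end{pmatrix}, \qquad
\hat{A}_4^{OLS} = \begin{pmatrix} -0.25_{[\pm 0.24]} & 0.03_{[\pm 0.26]} \\ -0.04_{[\pm 0.26]} & -0.03_{[\pm 0.27]} \end{pmatrix},
\]

\[
\hat{A}_5^{OLS} = \begin{pmatrix} -0.06_{[\pm 0.25]} & -0.04_{[\pm 0.17]} \\ -0.02_{[\pm 0.19]} & -0.05_{[\pm 0.24]} \end{pmatrix},
\]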



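and the following PAM_ALS

\[
\hat{A}_1^{ALS} = \begin{pmatrix} -0.42_{[\pm 0.18]} & 0.06_{[\pm 0.24]} \\ 0.02_{[\pm 0.11]} & -0.70_{[\pm 0.21]} \end{pmatrix}, \qquad
\hat{A}_2^{ALS} = \begin{pmatrix} -0.58_{[\pm 0.19]} & 0.07_{[\pm 0.30]} \\ 0.03_{[\pm 0.12]} & -0.26_{[\pm 0.26]} \end{pmatrix},
\]

\[
\hat{A}_3^{ALS} = \begin{pmatrix} -0.21_{[\pm 0.21]} & 0.07_{[\pm 0.30]} \\ 0.08_{[\pm 0.13]} & -0.21_{[\pm 0.26]} \end{pmatrix}, \qquad
\hat{A}_4^{ALS} = \begin{pmatrix} -0.20_{[\pm 0.20]} & 0.06_{[\pm 0.31]} \\ 0.02_{[\pm 0.13]} & -0.01_{[\pm 0.27]} \end{pmatrix},
\]

\[
\hat{A}_5^{ALS} = \begin{pmatrix} -0.13_{[\pm 0.19]} & 0.06_{[\pm 0.26]} \\ -0.01_{[\pm 0.12]} & -0.07_{[\pm 0.22]} \end{pmatrix}.
\]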
It can be seen that the PAM_OLS and PAM_ALS in the $\hat{A}_i^{OLS}$ and $\hat{A}_i^{ALS}$ for i = 3, 4
and 5 do not seem significant, so that one can identify p = 2 using our modified tools.
The cut-off at p = 2 is clearly marked for the PAM_OLS and PAM_ALS. If the PAM_S
are used, one could again select p = 3 or even p = 4, and the cut-off is not so clearly
marked in this case. We also see that the 95% standard and OLS confidence bounds can
be quite different. If the lag length were selected automatically using the PAM and
PCM, larger lag lengths would have been chosen. Indeed some of the PAM and PCM are
only slightly beyond the 95% confidence bounds (see for instance the P_OLS(3),
P_OLS(4) or P_ALS(3) in Figures 5.3 and 5.4).
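
The cut-off reading of the PAM_ALS can be checked mechanically with a short sketch. It takes the PAM_ALS point estimates and 95% half-widths reported above and returns the largest lag at which some entry lies strictly outside its bounds. The strict inequality is an assumption of this sketch: since the reported values are rounded to two decimals, entries equal to their bound (such as -0.21 against ±0.21) are treated as not significant. The function name is hypothetical.

```python
import numpy as np

# PAM_ALS point estimates for lags 1 to 5, as reported in the text.
pam_als = np.array([
    [[-0.42, 0.06], [0.02, -0.70]],
    [[-0.58, 0.07], [0.03, -0.26]],
    [[-0.21, 0.07], [0.08, -0.21]],
    [[-0.20, 0.06], [0.02, -0.01]],
    [[-0.13, 0.06], [-0.01, -0.07]],
])
# Corresponding 95% confidence half-widths.
half_width = np.array([
    [[0.18, 0.24], [0.11, 0.21]],
    [[0.19, 0.30], [0.12, 0.26]],
    [[0.21, 0.30], [0.13, 0.26]],
    [[0.20, 0.31], [0.13, 0.27]],
    [[0.19, 0.26], [0.12, 0.22]],
])

def last_significant_lag(estimates, bounds):
    """Largest lag with at least one entry strictly outside its bounds (0 if none)."""
    outside = np.abs(estimates) > bounds
    lags = np.flatnonzero(outside.any(axis=(1, 2))) + 1
    return int(lags.max()) if lags.size else 0

p_hat = last_significant_lag(pam_als, half_width)
print(p_hat)  # 2
```

Such an automatic rule is only a rough guide: as noted above, entries that are barely beyond their bounds would push a mechanical rule towards larger lag lengths.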

In general it emerges from our empirical study that the standard identification
tools lead to the selection of large lag lengths for VAR models with non constant
variance. This may be viewed as a consequence of the fact that the standard tools are
not adapted to our non standard framework. Note that the identification of the model is
the first step of the VAR modeling of time series. In such a situation the practitioner
is likely to fit a VAR model with too large a number of parameters, which can affect
the analysis of the series. The identification tools developed in this paper take
unconditional heteroscedasticity into account. From the real data study we found that
the modified tools are more parsimonious.
Appendix: Tables and Figures



Table 1: Frequency (in %) of selected lag length. The simulated processes are of length
n = 50 with homoscedastic variance.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      5.0    77.1    10.1     4.6     3.2
AIC_GLS      5.2    86.0     6.4     1.7     0.7



Table 2: The same as in Table 1 but for n = 100.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      0.3    83.3    11.4     3.5     1.5
AIC_GLS      0.2    86.6     9.9     2.4     0.9



Table 3: The same as in Table 1 but for n = 200.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      0.1    86.0     8.3     4.6     1.0
AIC_GLS      0.0    89.7     6.3     3.6     0.4
Table 4: Frequency (in %) of selected lag length. The simulated processes are of length
n = 50 with variance specified as in (5.1).

p              1       2       3       4       5
AIC          1.5     0.2     0.1     0.1    98.1
AIC_ALS      8.0    58.4    15.6     9.0     9.0
AIC_GLS      7.2    84.0     6.4     1.8     0.6



Table 5: The same as in Table 4 but for n = 100.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      2.4    78.1    11.9     5.5     2.1
AIC_GLS      0.2    86.6     9.8     2.4     1.0



Table 6: The same as in Table 4 but for n = 200.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      1.6    84.1     9.3     3.8     1.2
AIC_GLS      0.0    89.1     6.6     3.8     0.5



Table 7: Frequency (in %) of selected lag length. The simulated processes are of length
n = 50 with variance specified as in (5.2).

p              1       2       3       4       5
AIC          3.3     0.2     0.4     0.7    95.4
AIC_ALS     19.2    44.7    16.6    11.3     8.2
AIC_GLS      7.5    79.8     8.3     3.0     1.4



Table 8: The same as in Table 7 but for n = 100.

p              1       2       3       4       5
AIC          0.4     0.1     0.0     0.0    99.5
AIC_ALS     19.3    63.7    11.2     4.3     1.5
AIC_GLS      0.3    84.6     9.9     3.5     1.7



Table 9: The same as in Table 7 but for n = 200.

p              1       2       3       4       5
AIC          0.0     0.0     0.0     0.0   100.0
AIC_ALS      3.9    77.3    11.1     5.6     2.1
AIC_GLS      0.0    86.5     8.2     4.3     1.0
Table 10: The quarterly foreign direct investment and government securities held by
foreigners for the U.S. (n = 147): The selected autoregressive order using AIC and AIC_ALS.

p              1       2       3       4       5
AIC        13.91   13.73   13.59   13.53   13.52
AIC_ALS    11.80   11.61   11.74   11.77   11.80
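
The selection rule applied to the criterion values of Table 10 amounts to taking the order with the smallest value in each row. A minimal sketch, with the values copied from the table and variable names that are purely illustrative:

```python
# Criterion values from Table 10, indexed by the candidate order p.
criteria = {
    "AIC":     {1: 13.91, 2: 13.73, 3: 13.59, 4: 13.53, 5: 13.52},
    "AIC_ALS": {1: 11.80, 2: 11.61, 3: 11.74, 4: 11.77, 5: 11.80},
}

# For each criterion, keep the order that minimises it.
selected = {name: min(vals, key=vals.get) for name, vals in criteria.items()}
print(selected)  # {'AIC': 5, 'AIC_ALS': 2}
```

Note that the AIC values at p = 4 and p = 5 differ only in the second decimal, so the standard criterion is nearly flat at the largest orders, whereas the AIC_ALS has a distinct minimum at p = 2.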




Figure 5.1: The differences of the government securities held by foreigners (on the left) and of the
foreign direct investment (on the right) in billions of dollars (n = 147).
Figure 5.2: The $\hat{u}_{it}^2$'s (full line) and the non parametric estimation of Var($u_{it}$) (dotted line) on the




Figure 5.3: The OLS partial correlation matrices. The 95% OLS confidence bounds (in full lines)
are obtained from (4.4) while the 95% standard confidence bounds (in dotted lines) are obtained using
Figure 5.4: The ALS partial correlation matrices. The 95% confidence bounds are obtained from
g3) 